INHIBITORS OF NUCLEAR PROTEIN: NUCLEAR RECEPTOR INTERACTION


The present invention relates to inhibition of the interaction between nuclear proteins
and nuclear receptors through identification of the key structural element responsible for the
interaction.

The binding of lipophilic hormones, retinoids and vitamins to members of the nuclear
receptor (NR) superfamily (to form "liganded" receptors) modifies their DNA binding and
transcriptional properties, resulting in the activation or repression of target genes. Ligand
binding induces conformational changes in NRs and promotes their association with a diverse
group of nuclear proteins, including SRC-1/p160 (refs 3, 4, 5), TIF2 (refs 6, 7) and CBP/p300
(refs 4, 5, 8, 9), which function as coactivators, and RIP-140, TIF1 (ref. 11) and TRIP1/SUG1
(refs 12, 13), whose functions are unclear.

The recruitment of nuclear proteins (coactivators and/or other so-called bridging
proteins) by NRs is thought to be essential to their function as ligand-induced transcription
factors. Structural studies of the ligand binding domains (LBDs) of three different nuclear
hormone receptors, the retinoid X receptor α (RXRα) (ref. 15), the retinoic acid receptor γ
(RARγ) (ref. 16) and the thyroid hormone receptor β (TRβ), have led to the proposal that
binding of ligand results in a realignment of a conserved amphipathic α-helix, Helix 12 (H12),
generating a novel surface required for coactivator binding and consequently activator
function 2 (AF2)-dependent transactivation. Consistent with this, mutations of conserved
hydrophobic residues in H12 which impair AF2 (refs 14, 18-20) also interfere with the ability
of NRs to bind coactivators (refs 4, 6, 10, 11, 13). Less is known about the coactivator
sequences which mediate interaction with NRs, although several proteins appear to contain
multiple NR binding sites (refs 5, 8, 21). Le Douarin et al (1996) in EMBO Journal, 15,
6701-6715, identified a leucine rich region in three coactivators (TIF1, RIP140 & TRIP3) which
they called the "NR box"; see Figure 3D therein. However, the present state of knowledge is
completely silent about precisely how liganded nuclear receptors interact with nuclear proteins
as a class to modify their DNA binding and transcriptional properties, resulting in the
activation or repression of target genes. Indeed a commentator on the field stated, after the
first filing date of the present invention, that "characterizing the mechanisms by which
nuclear factors engage the transcriptional apparatus in response to hormonal stimulation has
seemed, at times, to be an insurmountable task" (Marc Montminy in Nature, 12th June 1997, 387,
654-655; see 1st paragraph thereof).

The present invention is based on the discovery that a short signature motif present in
the nuclear proteins is necessary and sufficient to mediate their binding to liganded NRs.

According to one aspect of the present invention there is provided a method for
identifying inhibitor compounds capable of reducing the interaction between:

a) a first region which is a signature motif on a nuclear protein, and 

b) a second region which is that part of a nuclear receptor which is capable of interacting 
with the nuclear protein through binding to the signature motif, 

wherein:

the nuclear protein is a bridging factor that is responsible for the interaction between a
liganded nuclear receptor and a transcription initiation complex involved in regulation of gene
expression;

the nuclear receptor is a transcription factor;

the signature motif is a short sequence of amino acid residues which is the key structural
element of a nuclear protein which binds to a liganded nuclear receptor as part of the process
of the activation or repression of target genes; and

in which the method comprises taking:

i) the potential inhibitor compound;

ii) the liganded nuclear receptor or a fragment thereof in which the fragment comprises
the second region defined in this claim in b) above;

iii) a nuclear protein fragment comprising a signature motif of the nuclear protein; and

iv) detecting the presence or absence of inhibition of the interaction between ii) and iii).

The term "nuclear protein" means the bridging factors (including coactivators) that are
responsible for the interaction between a liganded nuclear receptor and the transcription
initiation complex involved in regulation of gene expression (reviewed for steroid hormone
receptors in Beato, M., Herrlich, P. & Schutz, G., Cell 83, 851-857 (1995)). The term
bridging factor may include part of the transcription initiation complex itself. The term
"nuclear receptor" means the family of nuclear receptors such as described in Mangelsdorf,
D.J., et al., Cell 83, 835-839 (1995). The term "signature motif" means a short sequence of
generally at least about 5 amino acids, preferably 4-10, more preferably 5-10 amino acid
residues which is the key structural element of a nuclear protein which binds to a liganded
nuclear receptor as part of the process of the activation or repression of target genes. The term
"liganded nuclear receptor" means an activated nuclear receptor, for example, with ligand
bound thereto. The ligand can take various forms such as for example a hormone, a small
molecule compound or a peptide. For example, in the case of some of the nuclear receptors
(e.g. PPAR) there are non hormonal peptide ligands e.g. leukotrienes. Although nuclear
receptors are generally activated through ligand binding, some receptors, such as for example
orphan receptors, may be active without the need for ligand and/or activated through
non-ligand dependent pathways, and these receptors are also within the scope of the invention
as a less preferred embodiment.

The term "fragment" means an incomplete part. Before the present invention, a skilled
person could not have known which fragment or fragments of a nuclear protein could be taken
to retain activity. Use of fragments compared with whole proteins is particularly
advantageous in screening assays. In a preferred embodiment a preferred fragment size of a
nuclear protein is 8-10 amino acids, such as, for example, shown in Figure 1A herein. The
liganded nuclear receptor is preferably in the form of a fragment. In general, fragments
comprise at least 8 amino acids.

Preferably the signature motif is represented by B1XXLL in which B1 is any natural
hydrophobic amino acid, L is leucine and X represents any natural amino acid. Values for X
within the signature motif are independently selected, i.e. X may be the same or different.
Preferably B1 is leucine or valine, with leucine being most preferred. In some instances the
preferred signature motif is further defined as B2B1XXLL wherein "B2" is a hydrophobic
amino acid residue as defined for B1. A "natural hydrophobic amino acid" is defined as any
one of isoleucine, leucine, methionine, phenylalanine, tryptophan, tyrosine or valine.
Preferably the signature motif is in the conformation of a helix, preferably an amphipathic
helix, and the leucine residues form a hydrophobic face thereof. Preferably the signature
motif is positioned within a molecule so that it is available at the surface thereof for
interaction with proteins. Preferably values of X do not include Cys or Pro. Preferably at
least one value of X is not a natural hydrophobic amino acid or proline. One value of X is
preferably independently selected from Arg, Asn, Asp, Glu, Gln, His, Lys, Ser, Thr, Gly or
Ala; more preferably one value of X is independently selected from Arg, Asn, Asp, Glu, Gln,
His or Lys. Without wishing to be bound by theoretical considerations, it is believed that
preferred values of X favour the signature motif forming an amphipathic helix.
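
The motif definition and the stated preferences for B1 and X can be expressed as a simple sequence check. The following Python sketch is illustrative only and forms no part of the disclosed method: the function name and the dictionary keys are hypothetical, and residues are assumed to be given in the 1 letter nomenclature. It does not assess helicity or surface availability.

```python
# Residue sets taken from the definitions in the text (1 letter codes).
HYDROPHOBIC = set("ILMFWYV")           # "natural hydrophobic amino acids"
NATURAL = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 natural amino acids
PREFERRED_X = set("RNDEQHKSTGA")       # preferred values for one X
MORE_PREFERRED_X = set("RNDEQHK")      # more preferred values for one X


def check_signature_motif(pentamer):
    """Check a 5-residue string against B1XXLL and the stated preferences.

    Returns None if the string is not a B1XXLL motif; otherwise returns a
    dict of booleans, one per preference (key names are hypothetical).
    """
    p = pentamer.upper()
    if len(p) != 5 or not set(p) <= NATURAL:
        return None
    b1, x1, x2, l4, l5 = p
    if b1 not in HYDROPHOBIC or l4 != "L" or l5 != "L":
        return None
    xs = (x1, x2)
    return {
        "b1_is_leucine": b1 == "L",
        "b1_is_leu_or_val": b1 in "LV",
        "x_excludes_cys_pro": all(x not in "CP" for x in xs),
        "has_non_hydrophobic_x": any(
            x not in HYDROPHOBIC and x != "P" for x in xs
        ),
        "has_preferred_x": any(x in PREFERRED_X for x in xs),
        "has_more_preferred_x": any(x in MORE_PREFERRED_X for x in xs),
    }
```

For instance, LHRLL (the motif within SEQ ID NO: 4) satisfies every preference listed, whereas AQQLL is rejected because alanine is not among the hydrophobic residues defined above.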

After the first filing date of the present invention there was a simultaneous publication
of the identification of LXXLL motifs by two groups (one of which included the inventors of
the present invention), namely Heery et al, Nature, 12th June 1997, 387, 733-736 & Torchia et
al, Nature, 12th June 1997, 387, 677-684.

Herein we show that the ability of a nuclear protein (SRC1) to bind a nuclear receptor
(liganded ER) and enhance its transcriptional activity is dependent upon the integrity within
the nuclear protein of the signature motif (LXXLL; SEQ ID NO: 1), as well as key
hydrophobic residues in the conserved helix (Helix 12) of NRs required for their
ligand-induced activation function (AF-2). The signature motif is also found in TIF1, TIF2,
p300, RIP140 and the TRIP proteins, and occurs within regions of these proteins known to be
sufficient for interaction with NRs. Thus the LXXLL motif (SEQ ID NO: 1) is a signature
sequence which facilitates the interaction of diverse proteins with nuclear receptors, and thus
is a key part of a new family of nuclear proteins.

A preferred nuclear protein is a coactivator; in particular the nuclear protein includes
any one of RIP140, SRC-1, TIF2, CBP, p300, TIF1, Trip1, Trip2, Trip3, Trip4, Trip5, Trip8
or Trip9. Further preferred nuclear proteins include p/CIP, ARA70 & Trip230.

In this specification a reference to a nuclear protein or nuclear receptor includes
isoforms thereof unless stated or otherwise implicit from the context. An isoform is one of a
family or collection of related proteins derived from a single gene. Thus isoforms may differ
slightly in their amino acid sequences such as for example from differential splicing of exons
following transcription. SRC1a is an example of an isoform of SRC1. Two isoforms of SRC-1,
namely SRC1a and SRC1e, have been shown to contain differences in number of signature
motifs and to be functionally distinct in so far as they appear to play different roles in
ER-mediated transcription (Kalkhoven et al, 1998, EMBO Journal, 17, 232-243).

Nuclear receptors are transcription factors. A preferred transcription factor comprises
at least part of a conserved amphipathic α-helix, and especially preferred is retinoic acid
receptor or a steroid hormone receptor. Preferred steroid hormone receptors are oestrogen
receptor, progesterone receptor, androgen receptor and glucocorticoid receptor, with oestrogen
receptor being especially preferred.

Preferably the second region comprises at least part of a conserved amphipathic α-helix
such as for example Helix 12 in the oestrogen receptor, which is especially preferred.

An especially preferred combination of nuclear receptor and nuclear protein is one in
which the nuclear receptor is oestrogen receptor and the nuclear protein is selected from
SRC1, TIF2, CBP and p300, with SRC1 and especially SRC1a being most preferred.

A preferred method is in the form of a 2-hybrid assay system. Such assay systems are 
well known in the art; suitable references include Fields & Sternglanz (1994) TIG, August 
1994, 10, 286-292 and US patent 5283173. 

Any suitable assay design may be employed such as for example radioisotopic assay,
scintillation proximity assay (reviewed by ND Cook, 1996, Drug Discovery Today, 1, 287-294)
or fluorescence, particularly time resolved fluorescence, assay (reviewed by MV Rogers,
1997, Drug Discovery Today, 2, 156-160). High-throughput screening technologies have
been reviewed by Houston & Banks in Current Opinion in Biotechnology 1997, 8, 734-740.

In a preferred embodiment of the invention the potential inhibitor is in the form of a
peptide library based on a signature motif.

Encoded peptide libraries generally have a maximum of 20 possible amino acids at 
any one position. In practice, using current technology, it is difficult to screen libraries of 
more than lO'- 10^ members, which means that it is difficult to randomise more than 6 
positions in a peptide. Hence, narrowing down the nuclear protein binding region to a signature
motif is of great advantage if a peptide library approach is to be employed.
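
The arithmetic behind this constraint can be sketched as follows. The Python below is illustrative only: it assumes a fully randomised library with 20 possible amino acids at each randomised position, the function names are hypothetical, and the practical screening ceiling is left as a parameter to be supplied by the user rather than stated here.

```python
def library_size(randomised_positions, alphabet=20):
    """Number of distinct peptides when each randomised position may hold
    any one of `alphabet` amino acids."""
    return alphabet ** randomised_positions


def max_randomisable_positions(screen_limit, alphabet=20):
    """Largest number of fully randomised positions whose library still
    fits within `screen_limit` members (an assumed, user-supplied ceiling)."""
    n = 0
    while alphabet ** (n + 1) <= screen_limit:
        n += 1
    return n


if __name__ == "__main__":
    for n in range(4, 8):
        print(n, library_size(n))
```

Six randomised positions give 20^6 = 64,000,000 members, while a seventh raises this to 1,280,000,000, which illustrates why fixing the conserved residues of a signature motif leaves a far more tractable number of positions to randomise.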

A peptide library is a collection of peptides of varying sequences. There are in general
two ways to generate peptide libraries (reviewed by Scott, 1992; Birnbaum and Mosbach,
1992; Houghten, 1993; see also Abelson, 1996). The first approach is to generate libraries in
which positive peptides are identified through the sequencing of the peptides themselves.
Mixtures of peptides may be chemically synthesised in such a way that the peptides are linked
to beads, so that each bead contains only one peptide. If a bead is identified which contains a
positive peptide, the bead may be recovered and the peptide identified by chemical
sequencing. This approach was first demonstrated using the ability of antibodies to identify
specific six amino acid peptides from mixtures (Lam et al, 1991). The importance of using
beads is that the identification event (in this case the antibody:peptide interaction) leads to
the recovery of a bead which contains more peptide than that bound by the antibody itself. In a
related approach mixtures of free peptides can be synthesised and screened in pools: by using
a deconvolution process, positive peptides are identified (Houghten et al, 1991).

The second approach is to generate libraries in which positive peptides are identified
by sequencing a molecule which is associated in some way with the peptide. Peptides and
some other molecule, such as a nucleic acid, can be cosynthesised on beads, so that each
nucleic acid "tags" the peptide found on the same bead. If a bead is identified which is
positive, sequencing the nucleic acid will identify the peptide on this bead (Brenner and
Lerner, 1992). Alternatively, peptide libraries can be generated using the gene expression
machinery of living organisms. In this approach, it is not necessary to make a library of
peptide molecules. Instead a library of DNA molecules is constructed, so that each molecule
encodes a peptide with a different sequence. This encoded library must then be expressed in a
suitable host organism in order that the peptide library be produced. The library is then
screened. It is essential that the nucleic acid which encodes the library remain physically
linked to the protein in some way, so that recovery, or identification, of the active peptides
leads to recovery of the DNA which encodes these peptides. The sequences of the active
peptides may then be deduced by sequencing of the DNA which encodes them. Several
variations of this approach have been described.

The most widely used version of this approach is to express peptides as part of the
coat proteins of a virus such as M13. The viruses can be screened by the ability of this coat
protein to bind target proteins such as antibodies (Devlin et al, 1990; Scott and Smith, 1990;
Cwirla et al, 1990) or receptors (the atrial natriuretic peptide receptor: Cunningham et al,
1994; and the thrombopoietin receptor: Cwirla et al, 1997). The approach may also be used
to find protease inhibitors through their ability to bind to proteases (Roberts et al, 1992;
Markland et al, 1996) as well as to find optimal substrates for proteases, such as stromelysin
and matrilysin (Smith et al, 1995) and subtilisin (Matthews and Wells, 1993).

An intracellular approach to the generation of peptides that recognise certain proteins
is to use the yeast two-hybrid system. In the two-hybrid system, interacting proteins are fused
to domains of transcription factors. If a protein:protein interaction occurs, then transcription
of a reporter gene is stimulated (Fields and Song, 1989). By making one component of the
two-hybrid system a peptide library, and selecting for cells in which reporter gene output
occurs, it is possible to isolate peptides which bind to a target protein. This approach was
used to identify peptides which bound to the retinoblastoma protein (Yang et al, 1995). In a
similar approach, Colas et al (1996) expressed the peptide library as a loop in the surface of
the E. coli TrxA protein and isolated peptides which bound to cyclin-dependent kinase 2 (Cdk2).

According to another aspect of the present invention there is provided a method of
reducing the interaction between

a) a first region which is a signature motif on a nuclear protein, and 

b) a second region which is that part of a nuclear receptor which is capable of interacting 
with the nuclear protein through binding to the signature motif, 

in which the method comprises adding an inhibitor in the presence of the nuclear receptor and
the nuclear protein, the inhibitor being characterised in that it reduces the interaction between
the first region of the nuclear protein and the second region of the nuclear receptor.

According to another aspect of the present invention there is provided a novel inhibitor
as described above. Preferably the inhibitor is a peptide, more preferably a peptide
comprising the signature motif defined above, and more preferably the peptide has less than
15 amino acid residues. Especially preferred inhibitors are any one of the following peptides:
PQAQQKSLLQQLLT (SEQ ID NO: 2), KLVQLLTTT (SEQ ID NO: 3), ILHRLLQE (SEQ
ID NO: 4), or LLQQLLTE (SEQ ID NO: 5). Peptides may be prepared using conventional
techniques, for example using solid phase synthesis and Fmoc chemistry. These peptides are
expected to be useful in the treatment of oestrogen responsive tumours. Inhibitors of the
invention are expected to be useful in the treatment of any disease mediated through
interaction between a signature motif on a nuclear protein and a nuclear receptor. For
example, suitable inhibitors are expected to be useful in treatment of cancer or inflammation.

A novel inhibitor, for example, could be an antibody against a signature motif or a
novel small molecule which binds to the signature motif or its complementary binding target
(nuclear receptor second region) such that normal biological activity is prevented. Examples
of small molecules include but are not limited to small peptides or peptide-like molecules.

Whilst the signature motif is demonstrated herein to apply across nuclear proteins as a
class, it is expected that different nuclear receptors display both coactivator and signature
motif preferences that contribute to specificity of hormonal response (Ding et al, 1998,
Molecular Endocrinology, 12, 302-313), which in turn points to selective pharmaceutical
intervention opportunities. Figures 1A and 5 below also indicate that individual motifs may
differ in the strength with which they bind to nuclear receptors.

Note the NR box of Le Douarin et al (discussed above) did not disclose a signature
motif within the meaning of the present invention because, for example, the NR box within
the meaning of Le Douarin would be present in at most only 4 of the 39 signature motifs
identified by the present invention in Figures 3A & 4 (see below). Furthermore, Le Douarin
et al did not even suggest inhibitors of nuclear receptor - nuclear protein interaction.

The term "antibodies" is meant to include polyclonal antibodies, monoclonal
antibodies, and the various types of antibody constructs such as for example F(ab')2, Fab and
single chain Fv. Antibodies are defined to be specifically binding if they bind with a Ka of
greater than or equal to about 10^7 M^-1. Affinity of binding can be determined using
conventional techniques, for example those described by Scatchard et al., Ann. N.Y. Acad.
Sci., 51: 660 (1949).

Polyclonal antibodies can be readily generated from a variety of sources, for example,
horses, cows, goats, sheep, dogs, chickens, rabbits, mice or rats, using procedures that are
well-known in the art. In general, immunogen is administered to the host animal typically
through parenteral injection. The immunogenicity may be enhanced through the use of an
adjuvant, for example, Freund's complete or incomplete adjuvant. Following booster
immunizations, small samples of serum are collected and tested for reactivity. Examples of
various assays useful for such determination include those described in: Antibodies: A
Laboratory Manual, Harlow and Lane (eds.), Cold Spring Harbor Laboratory Press, 1988; as
well as procedures such as countercurrent immuno-electrophoresis (CIEP),
radioimmunoassay, radioimmunoprecipitation, enzyme-linked immuno-sorbent assays
(ELISA), dot blot assays, and sandwich assays, see U.S. Patent Nos. 4,376,110 and 4,486,530.

Monoclonal antibodies may be readily prepared using well-known procedures, see for
example, the procedures described in U.S. Patent Nos. 4,902,614, 4,543,439 and 4,411,993;
Monoclonal Antibodies, Hybridomas: A New Dimension in Biological Analyses, Plenum
Press, Kennett, McKearn, and Bechtol (eds.), (1980).

Monoclonal antibodies can be produced using alternative techniques, such as those
described by Alting-Mees et al., "Monoclonal Antibody Expression Libraries: A Rapid
Alternative to Hybridomas", Strategies in Molecular Biology 3: 1-9 (1990) which is
incorporated herein by reference. Similarly, binding partners can be constructed using
recombinant DNA techniques to incorporate the variable regions of a gene that encodes a
specific binding antibody. Such a technique is described in Larrick et al., Biotechnology, 7:
394 (1989).

According to a further feature of the invention there is provided a pharmaceutical
composition which comprises a novel inhibitor of the invention, or a pharmaceutically-acceptable
salt thereof, in association with a pharmaceutically-acceptable diluent or carrier.

The composition may be in a form suitable for oral use, for example a tablet, capsule,
aqueous or oily solution, suspension or emulsion; for topical use, for example a cream,
ointment, gel or aqueous or oily solution or suspension; for nasal use, for example a snuff,
nasal spray or nasal drops; for vaginal or rectal use, for example a suppository; for
administration by inhalation, for example as a finely divided powder such as a dry powder, a
microcrystalline form or a liquid aerosol; for sub-lingual or buccal use, for example a tablet or
capsule; or for parenteral use (including intravenous, subcutaneous, intramuscular,
intravascular or infusion), for example a sterile aqueous or oily solution or suspension. In
general the above compositions may be prepared in a conventional manner using conventional
excipients. For peptidic inhibitors, parenteral compositions are preferred.

The amount of active ingredient that is combined with one or more excipients to
produce a single dosage form will necessarily vary depending upon the host treated and the
particular route of administration. For example, a formulation intended for oral
administration to humans will generally contain, for example, from 0.5 mg to 2 g of active
agent compounded with an appropriate and convenient amount of excipients which may vary
from about 5 to about 98 percent by weight of the total composition. Dosage unit forms will
generally contain about 1 mg to about 500 mg of an active ingredient.

According to another aspect of the present invention there is provided a method of
mapping nuclear receptor interaction domains in nuclear proteins in which the method
comprises analysis of the sequence of a nuclear protein for the presence of signature motifs as
defined herein in order to identify an interaction domain or a potential interaction domain.
Preferably the analysis further comprises analysis of any potential interaction domains
identified thereby for α-helicity and/or surface accessibility.
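
The sequence-analysis step of this mapping method can be sketched in a few lines. The Python below is illustrative only and is not the procedure used by the inventors: it assumes the sequence is supplied in the 1 letter nomenclature, searches only for the LXXLL form of the motif, and uses a hypothetical function name. It reports candidate positions and does not assess α-helicity or surface accessibility, which would require separate prediction tools.

```python
import re

# A lookahead is used so that overlapping occurrences are all reported.
_LXXLL = re.compile(r"(?=(L[A-Z]{2}LL))")


def find_lxxll_motifs(sequence):
    """Return (start, end, motif) for every LXXLL occurrence.

    Positions are 1-based and inclusive, counted from the N-terminus of
    the supplied sequence.
    """
    seq = sequence.upper()
    hits = []
    for match in _LXXLL.finditer(seq):
        start = match.start(1)  # 0-based index of the first leucine
        hits.append((start + 1, start + 5, match.group(1)))
    return hits


if __name__ == "__main__":
    # SEQ ID NO: 2 and SEQ ID NO: 3 from this specification.
    for peptide in ("PQAQQKSLLQQLLT", "KLVQLLTTT"):
        print(peptide, find_lxxll_motifs(peptide))
```

Applied to a full-length protein sequence, each reported position marks a potential interaction domain to be examined further.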


WO 98/49561 PCT/GB98/01238

The invention is illustrated by the non-limiting Examples below in which, unless
stated otherwise: temperatures are expressed in degrees Celsius; and peptide sequences are
listed N-terminus to C-terminus.

Figures 1a/1b show the interaction of LXXLL motifs derived from coactivators
with the ER.

Figure 1a: Yeast two hybrid interactions of LXXLL motifs, derived from the proteins
RIP140, SRC1a and CBP, with the LBDs of wild type or mutant ER. The sequences of the
LXXLL motifs in the DNA binding domain (DBD) fusion proteins are indicated. DBD-LXXLL
proteins were coexpressed with AAD-ER or AAD-ER Mut, which consist of an
acidic activation domain (AAD) fused to the LBD of the wild-type ER, or a transcriptionally
defective ER mutant, respectively. Reporter activities were determined in the presence or
absence of 10-^M 17-β-estradiol (E2) and expressed as units of β-galactosidase activity.
Sequences listed in Figure 1a, from top to bottom, are listed as SEQ ID NO: 6-23
respectively.

Figure 1b: Effects of mutations in the RIP140 LXXLL motif located at amino acids 935-943
on binding of AAD-ER. Conserved leucine residues are boxed and mutated residues are
circled. The reporter activity was determined in the presence (black bars) or absence (white
bars) of 10-7M E2. Sequences listed in Figure 1b, from top to bottom, are listed as SEQ ID
NO: 24-32 respectively.

Figures 2a/2b/2c show that LXXLL motifs are required for binding of SRC1 to
the ER LBD in vitro and for the ability of SRC1 to enhance ER activity in vivo.

Figure 2a: Wild type (SRC1a) and mutant (SRC1a-M1234) SRC1 proteins are shown
schematically. The black bars represent the approximate locations of the LXXLL binding
motifs in the linear SRC1a sequence and the shaded circles indicate the mutation of LXXLL
binding motifs by replacement of conserved leucine residues with alanines (see Methods).
Binding of wild type SRC1a or SRC1a-M1234 mutant to glutathione S transferase (GST)
alone, to the ligand binding domain (aa 313-599) of ER (GST-AF2), or the SRC1 binding
domain (aa 2058-2163) of CBP (GST-CBP) in the presence (+) and absence (-) of 10-^M E2.
The signals obtained with 10% of the input of [35S]-labelled wild type and mutant SRC1
proteins are shown.




Figure 2b: The ability of increasing amounts of the peptides P-1 (SEQ ID NO: 2) and P-2
(SEQ ID NO: 72) to compete against the binding of wild type SRC1a to GST-AF2 in the
presence of ligand is shown. The sequences of the P-1 and P-2 peptides are given at the foot
of Fig 2b, and the conserved leucines and alanine substitutions are boxed.

Figure 2c: Wild type but not mutant SRC1e M123 potentiates activation by ER of the
reporter gene 2ERE-pS2-CAT in transiently transfected HeLa cells. Reporter activities were
obtained from extracts of transfected cells grown in the absence (white columns) or presence
(black columns) of ligand (10-8M E2). The amounts of ER, SRC1-wt and SRC1-mut
expression plasmids used in the transfections are indicated below the graph. The activities
shown are averaged from duplicates.

Figures 3a/3b & 4 show that the LXXLL sequence is a signature motif in proteins 
that bind the LBDs of NRs. 

Figure 3a: Alignment of LXXLL motif sequences present in human RIP140 (ref. 10), human
SRC1a, mouse TIF2 (ref. 6), mouse CBP (refs 23, 24), p300 (ref. 33), mouse TIF1 (ref. 11) and
human TRIP proteins (ref. 12). The conserved leucines are boxed and the amino acid numbers are
given for each motif. Sequences listed in Figure 3a, from top to bottom, are listed as SEQ ID
NO: 33-61 respectively.

Figure 3b: Schematic representation of the incidence of LXXLL motifs (black bars) in the
sequences of proteins which bind NRs. The amino acid boundaries of the known NR binding
sites are also shown.

Figure 4: Alignment of LXXLL motif sequences present in CBP, p300, p/CIP, ARA70 &
TRIP230. The conserved leucines are boxed and the amino acid numbers are given for each
motif. Sequences listed in Figure 4, from top to bottom, are listed as SEQ ID NO: 62-71
respectively.

Figure 5 shows that the precise nature of the LXXLL signature motif affects the
strength of the interaction between SRC-1a and the LBDs of ERα, ERβ and GR.

Figure 5a: The ERα LBD binds to the 4 signature motifs of SRC-1a with differing affinities in
yeast 2-hybrid assays (SRC-1a motif 1-4 = SEQ ID NO: 73-76). The order (decreasing affinity) is
SRC-1a motif 2 > SRC-1a motif 4 > SRC-1a motif 1 > SRC-1a motif 3.

Figure 5b: The ERβ LBD binds to the signature motifs of SRC-1a with the same relative
affinities seen with ERα. The order (decreasing affinity) is SRC-1a motif 2 > SRC-1a motif
4 > SRC-1a motif 1 > SRC-1a motif 3.

Figure 5c: The GR LBD binds to the signature motifs of SRC-1 with differing affinities and
the rank order of affinities differs from those seen with ERα and ERβ. The order (decreasing
affinity) is SRC-1a motif 4 > SRC-1a motif 1 = SRC-1a motif 2 = SRC-1a motif 3.

In Figure 5 the y-axis units represent relative β-galactosidase activity. In Figure 5 the x-axis
motif numbering only indicates part of the actual sequence used, which was as follows:
627-640, 684-696, 743-755 and 1428-1441.

The following abbreviations are used.


AAD     acidic activation domain
AF      activator function
DBD     DNA binding domain
E2      17-β-estradiol
ER      estrogen receptor
GR      glucocorticoid receptor
GST     glutathione S transferase
LBD     ligand binding domain
NR      nuclear receptor
PCR     polymerase chain reaction
RAR     retinoic acid receptor
RIP     receptor interacting protein
RXR     retinoid X receptor
SRC     steroid receptor coactivator
TIF     transcriptional intermediary factor
TRβ     thyroid hormone receptor β


Standard amino acid abbreviations have been used.

Alanine           Ala   A
Arginine          Arg   R
Asparagine        Asn   N
Aspartic acid     Asp   D
Cysteine          Cys   C
Glutamic acid     Glu   E
Glutamine         Gln   Q
Glycine           Gly   G
Histidine         His   H
Isoleucine        Ile   I
Leucine           Leu   L
Lysine            Lys   K
Methionine        Met   M
Phenylalanine     Phe   F
Proline           Pro   P
Serine            Ser   S
Threonine         Thr   T
Tryptophan        Trp   W
Tyrosine          Tyr   Y
Valine            Val   V
Any amino acid    Xaa   X


Point mutations will be referred to as follows: natural amino acid (using the 1 letter
nomenclature), position, new amino acid. For example "L636A" means that at position 636
leucine (L) has been changed to alanine (A). Multiple mutations will be shown between
square brackets.
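
This nomenclature is regular enough to be parsed mechanically. The Python sketch below is illustrative only: the function names are hypothetical, multiple mutations within square brackets are assumed to be separated by commas (the separator is not specified above), and the example fragment and its residue numbering are invented for demonstration.

```python
import re

# One point mutation: natural residue, position, new residue (e.g. L636A).
_POINT = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_mutations(text):
    """Parse 'L636A' or '[L636A, L637A]' into (natural, position, new) tuples."""
    body = text.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    parsed = []
    for token in body.split(","):
        match = _POINT.fullmatch(token.strip())
        if match is None:
            raise ValueError(f"not a point mutation: {token!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def apply_mutations(sequence, text, first_position=1):
    """Apply the mutations to `sequence`, whose first residue is numbered
    `first_position`; the stated natural residue must be present."""
    residues = list(sequence)
    for natural, position, new in parse_mutations(text):
        index = position - first_position
        if not 0 <= index < len(residues) or residues[index] != natural:
            raise ValueError(f"{natural}{position} not found in sequence")
        residues[index] = new
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical fragment whose first residue is numbered 634.
    print(apply_mutations("KSLLQQLL", "[L636A, L637A]", first_position=634))
```

Checking the natural residue before substitution guards against numbering errors when a fragment rather than the full-length protein is used.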
Example 1

Mapping of interaction sites between Nuclear Receptor and Nuclear Protein 

It has been previously demonstrated that the 140 kDa receptor interacting protein
(RIP140) bound directly to NRs through at least two distinct sites located at the
N- and C-termini of the protein 21. To map these interaction sites in more detail, we
examined a series of twenty different PCR-generated fragments of RIP140 coding sequence
fused in frame with a heterologous DBD, for interaction with NRs in a two hybrid system.
Remarkably, although the different constructs spanned the entire 1158 amino acids of
RIP140 sequence, all but two displayed ligand-dependent interaction with ER, including
five non-overlapping RIP140 sequences. By comparison of the sequences of the shortest
interacting fragments we identified a short motif (LXXLL) common to all interacting
fragments. In total, nine copies of the motif were identified in the RIP140 sequence, but
the motif was absent in fragments showing no binding activity in our experiments.
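The motif search described above can be illustrated with a short script. This is a minimal sketch rather than the procedure actually used: it assumes the motif is simply L, any two residues, L, L in one letter code, and the function name `find_lxxll` is hypothetical. A lookahead is used so that overlapping copies are all reported.

```python
import re

# Zero-width lookahead so that overlapping LXXLL copies are all found.
LXXLL = re.compile(r"(?=(L[A-Z]{2}LL))")


def find_lxxll(sequence, first_position=1):
    """Return (position, motif) pairs for every LXXLL copy in a sequence.

    Positions are numbered from first_position, as in protein numbering.
    """
    sequence = sequence.upper()
    return [(m.start() + first_position, m.group(1))
            for m in LXXLL.finditer(sequence)]


# The three inhibitor peptides listed in Example 6 each contain one copy.
for peptide in ("KLVQLLTTT", "ILHRLLQE", "LLQQLLTE"):
    print(peptide, find_lxxll(peptide))
```

A pattern match of this kind only identifies candidate sequences; as the later Examples show, secondary structure and accessibility also matter for binding.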

To determine if these short sequences were sufficient to bind to NRs, we constructed a
series of proteins consisting of a DBD fused to eight to ten amino acids incorporating one
copy of each of the nine LXXLL motifs. As shown in Fig. 1a, each of the nine motifs present
in RIP140 displayed strong ligand-dependent interaction with the LBD of the ER whereas the
DBD alone showed no ability to bind. (Note that of the 10 motifs listed for RIP140, the 10th
is a repeat of the 9th locus). Comparable results were obtained with the LBD of RAR (data
not shown). Mutation of hydrophobic residues within H12 abolishes AF2 activity and prevents
the recruitment of RIP140 10, TIF1, TIF2 6, SUG1 13 and SRC1. Similarly, mutation of H12
residues M543 and L544 in the ER abolished the ligand-dependent interaction of all nine
LXXLL motifs with ER (Fig. 1a). Taking these results together, we conclude that a short
conserved motif comprised within as few as eight amino acids is sufficient to bind to
transcriptionally active NRs. This discovery that such a relatively small motif can affect
the interaction between two relatively large molecules is unprecedented in this field.

Secondary structure analysis using the Phd program 22 revealed that each of the nine
copies of this motif in RIP140 occurred within a region predicted to be α-helical in nature,
in which the conserved leucines would form a hydrophobic face.
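Why the three conserved leucines would lie on one face can be seen from a helical wheel calculation. The sketch below is illustrative only and assumes an ideal α-helix of 3.6 residues per turn (100 degrees per residue); the function names are hypothetical and no structure prediction is performed.

```python
# Assumption: ideal alpha-helix, 360 degrees / 3.6 residues per turn.
DEGREES_PER_RESIDUE = 100.0


def wheel_angles(offsets):
    """Angular position on a helical wheel for each residue offset."""
    return [(offset * DEGREES_PER_RESIDUE) % 360.0 for offset in offsets]


def angular_spread(angles):
    """Smallest arc, in degrees, that contains all the given wheel angles."""
    if len(angles) < 2:
        return 0.0
    ordered = sorted(angles)
    # Gap between each angle and the next one going round the wheel.
    gaps = [(ordered[(i + 1) % len(ordered)] - ordered[i]) % 360.0
            for i in range(len(ordered))]
    return 360.0 - max(gaps)


# In LXXLL the leucines sit at offsets 0, 3 and 4 from the first leucine.
leucine_angles = wheel_angles([0, 3, 4])
print(leucine_angles)                  # [0.0, 300.0, 40.0]
print(angular_spread(leucine_angles))  # 100.0
```

Under this assumption the three leucines fall within a 100 degree arc of the wheel, whereas three consecutive residues (offsets 0, 1, 2) would span 200 degrees.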

WO 98/49561    PCT/GB98/01238

Example 2

Mutational Analysis of a Signature Motif 

To determine the sequence constraints required to observe a functional interaction, we
carried out a partial mutational analysis of one of the RIP140 motifs (amino acids 935-943;
Fig. 1b). While western blot analysis showed no significant variation in the expression of
the wild type and mutant fusion proteins (data not shown), mutation of valine 935 to alanine
resulted in an approximately ten-fold reduction in the reporter activity in the presence of
ligand. Coupled with the observation that the first amino acid is hydrophobic in seven of
the nine LXXLL motifs in RIP140, this may indicate a preference for a hydrophobic residue at
this position. Strikingly, mutation of any one of the three conserved leucine residues L936,
L939 or L940 to alanine resulted in a complete loss of binding to the LBD of ER (Fig. 1b) and
RAR (data not shown), emphasising their importance in mediating the interaction with NRs.
In contrast, mutation to alanine of L941 (which is not conserved among the motifs; see Fig.
3a) had no effect on the ability of this sequence to bind to the ER LBD. Replacement of a
conserved leucine residue with a valine was tolerated at L936, but not at L939 or L940,
indicating that hydrophobic character alone is not sufficient to maintain an interaction
with ER (Fig. 1b). The amino acids K937, Q938, S942 and E943 were not subjected to
mutagenesis as they are not conserved among the motifs we have identified (see Fig. 3a).
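The mutational results above can be set against a plain pattern match. The following sketch assumes the 935-943 sequence is VLKQLLLSE, assembled from the residue names given in the paragraph above, and records the outcomes only as worded in the text; the helper names are hypothetical.

```python
import re

# Assumption: residues 935-943 assembled from the residue names in the text.
WILD_TYPE = "VLKQLLLSE"
FIRST_POSITION = 935
LXXLL = re.compile(r"L[A-Z]{2}LL")

# Outcomes as worded in the text (Fig. 1b); no numerical data are implied.
REPORTED = {
    "V935A": "about ten-fold reduction",
    "L936A": "complete loss",
    "L939A": "complete loss",
    "L940A": "complete loss",
    "L941A": "no effect",
    "L936V": "tolerated",
    "L939V": "not tolerated",
    "L940V": "not tolerated",
}


def mutate(sequence, mutation):
    """Apply one point mutation such as 'L936A' to the numbered fragment."""
    old, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - FIRST_POSITION
    if sequence[index] != old:
        raise ValueError(f"{mutation}: expected {old}, found {sequence[index]}")
    return sequence[:index] + new + sequence[index + 1:]


def has_motif(sequence):
    """True if the sequence still matches the strict LXXLL pattern."""
    return LXXLL.search(sequence) is not None


for mutation, outcome in REPORTED.items():
    mutant = mutate(WILD_TYPE, mutation)
    print(mutation, mutant, "LXXLL intact:", has_motif(mutant), "| reported:", outcome)
```

The strict pattern agrees with the leucine to alanine results but not with every case: L936V removes the pattern match although the substitution was tolerated, and V935A leaves the pattern intact although reporter activity fell. A literal LXXLL match is therefore a screening aid and not a predictor of binding strength.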

Example 3

Analysis of Signature Motifs in Nuclear Proteins 

The steroid receptor coactivator SRC1, which stimulates ligand-dependent
transcriptional activity, was originally identified as a partial cDNA encoding a protein
capable of interacting with the progesterone receptor by means of a 196 amino acid
C-terminal region 3. We noted that the eight most C-terminal amino acids fit the LXXLL
consensus, and indeed this sequence (DBD-SRC1a 1434-1441) displayed strong ligand-induced
binding to ER, but not the ER H12 mutant (Fig. 1a). Subsequent studies have identified
full-length SRC1 (SRC1a) from mouse (1459 amino acids) 4,5 and human (1441 amino acids)
tissues. Both murine and human SRC1a proteins interact with multiple NRs, and contain an
additional interaction region between residues 569 to 789 5 and 570-780, respectively.
Three copies of the LXXLL motif were identified in this central interaction domain of human
SRC1a (see Fig. 3a & 3b), each of which displayed ligand-dependent binding to both ER
(Fig. 1a) and RAR (data not shown) in the two hybrid assay, but not the ER H12 mutant.
Interestingly, the sequences and relative positions of the three motifs in the central
domain of SRC1a are conserved in the related coactivator protein Transcriptional
Intermediary Factor 2 (TIF2) (Fig. 3a + b), and correspond to the region of TIF2 known to
bind to NRs 6. However, unlike SRC1a, TIF2 appears to lack a motif at its C-terminus. In
addition, we noted that SRC1 contains three other sequences matching the LXXLL consensus.
Although the motif at residues 45-53 is predicted by the Phd program to be α-helical and
lies within the basic helix-loop-helix domain at the N-terminus of SRC1a 5, it showed only
a very weak (6-fold) interaction with liganded ER (Fig. 1a) or RAR (not shown) in the yeast
two hybrid assay. This is consistent with the observed absence of strong NR-binding
activity associated with the N-terminus of SRC1. The other two motifs within residues
111-118 and 912-920 both contain proline residues and are unlikely to adopt α-helical
structure according to the Phd program. Indeed, these sequences showed no detectable
interaction with NRs in our binding assays (Fig. 1a), which strongly suggests a preference
for appropriate secondary structure for binding of LXXLL sequences to NRs.

Recent reports have indicated that CBP/p300 proteins, which were originally
identified as coactivators for CREB 23,24, are coactivators for many transcription factors
including NRs 4,5,8,9 and may serve as integrators of several signalling pathways 4. CBP was
shown to bind directly to NRs via its N-terminal 101 amino acids 4, with a possible
RXR-specific binding site between residues 356-495 8. Our analysis showed that the CBP
sequence harbours copies of the LXXLL motif within positions 68-78 and 356-364, which are
conserved in the p300 sequence (amino acids 80-90 and 341-351; Fig. 3a). Indeed, when
tested in the two hybrid assay, the N-terminus of CBP (amino acids 1-101; data shown) and
the LXXLL motif at residues 68-75 of the CBP sequence (Fig. 1a) displayed ligand-dependent
binding to ER, but not the transcriptionally defective ER mutant (Fig. 1a).

Example 4 

The Binding Of Coactivator Proteins To NRs Is Dependent On Signature Motifs 

To demonstrate that the binding of coactivator proteins to NRs is dependent on
LXXLL motifs, we introduced alanine substitutions in SRC1a at the conserved leucine
couplets at residues [L636A, L637A, L693A, L694A, L752A, L753A, L1438A, L1439A]
thus effectively creating a mutant protein (SRC1a-M1234) in which all the four functional
binding motifs were disabled. We then compared the ability of in vitro translated SRC1a and
SRC1a-M1234 to bind to the ligand binding domain of the mouse estrogen receptor fused to
glutathione-S-transferase (GST-AF2) in GST pulldown experiments (Maniatis et al., (1982)
Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York.). As shown in Fig. 2a, while wild type SRC1a protein displayed
ligand-dependent binding to GST-AF2, SRC1a-M1234 failed to bind to GST-AF2 either in the
presence or absence of ligand. To confirm that the mutations did not induce gross structural
disruption of SRC1a-M1234, we compared the ability of the in vitro translated proteins to
interact with amino acids 2058-2163 of CBP, which was previously defined as the SRC1
binding domain 4. Both proteins retained strong binding to GST-CBP (Fig. 2a) indicating that
this SRC1 function remained intact in both wild type and mutant proteins. In addition we
showed that the binding of wild type SRC1a to GST-AF2 was competed by increasing
concentrations of a short peptide (P1) corresponding to the motif at the C-terminus of SRC1a
(Fig. 2b). In contrast, a similar peptide (P2) in which the LXXLL motif was mutated, or
peptides unrelated to the LXXLL motif (data not shown), did not compete the binding of
SRC1a to GST-AF2 (Fig. 2b).
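The bracketed mutation list used for SRC1a-M1234 follows the point mutation nomenclature defined earlier (natural amino acid, position, new amino acid). As an illustration only, the sketch below parses such a list and groups adjacent positions into leucine couplets; the function names are hypothetical and the code makes no use of the SRC1a sequence itself.

```python
import re

# One point mutation: original residue, position, new residue (e.g. L636A).
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(text):
    """Parse 'L636A' or '[L636A, L637A]' into (old, position, new) tuples."""
    items = text.strip().strip("[]").split(",")
    parsed = []
    for item in items:
        match = MUTATION.match(item.strip())
        if match is None:
            raise ValueError(f"not a point mutation: {item!r}")
        old, position, new = match.groups()
        parsed.append((old, int(position), new))
    return parsed


def group_adjacent(positions):
    """Group consecutive residue numbers, e.g. 636 and 637, into tuples."""
    groups = []
    for position in sorted(positions):
        if groups and position == groups[-1][-1] + 1:
            groups[-1].append(position)
        else:
            groups.append([position])
    return [tuple(group) for group in groups]


M1234 = "[L636A, L637A, L693A, L694A, L752A, L753A, L1438A, L1439A]"
mutations = parse_mutations(M1234)
couplets = group_adjacent(position for _, position, _ in mutations)
print(couplets)  # [(636, 637), (693, 694), (752, 753), (1438, 1439)]
```

The four couplets correspond to the four functional binding motifs that the text describes as disabled in SRC1a-M1234.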

Finally, to demonstrate that LXXLL motifs are necessary for the function of SRC1 in
vivo, we compared the abilities of wild type SRC1 and a mutant protein in which all LXXLL
motifs were disabled to enhance the activity of mouse ER in transient transfection
experiments. As shown in Fig. 2c, wild type SRC1 enhanced the activity of ER in a
concentration-dependent manner. In contrast, the SRC1 mutant, which was unable to bind ER
(Fig. 2a), had no stimulatory effect, but reduced ER activity by up to 50% at the highest
concentration (Fig. 2c). This apparent dominant negative property of the mutant SRC1 is
likely due to its ability to maintain interactions with CBP while failing to interact with NRs
(Fig. 2a). This result is of interest given the recent evidence that SRC1 and CBP/p300 may
exist as a complex in vivo 9, and that CBP also has NR binding activity 4,8, as our data
suggest that the interactions between NRs and CBP are insufficient to compensate for the
inability of the SRC1a mutant protein to bind NRs, at least under these conditions. It remains
to be determined whether NRs are engaged simultaneously by p160 and p300 proteins
functioning independently or as a complex.

Examination of the sequences of other proteins known to bind to NRs revealed them to
contain one or more copies of the LXXLL motif. TIF1 contains a single motif (residues
722-732) within the minimal region known to be required for its interaction with NRs 11,25.
The truncated proteins TRIPs 2-5, TRIP8 and TRIP9, which were isolated in a two hybrid
screen for TR-interacting proteins 12, each contain at least one copy of the LXXLL motif
(Fig. 3a), whereas the motif was absent in TRIPs whose interaction with TR was
ligand-independent. An alignment of a selection of these sequences is shown in Fig. 3a,
while Fig. 3b shows the incidence of motifs in the sequences of RIP140, SRC1a, TIF1, TIF2,
CBP and p300, and the boundaries of known receptor interaction domains in these proteins.
Interestingly, motifs were also identified in several other proteins for which evidence
exists of interaction with NRs, including Ara70 26, SWI3 27, and the RelA (p65) subunit of
NFκ-B 28, although the receptor interaction domains in these proteins have not been mapped.
The ability of other proteins containing LXXLL motifs to bind to NRs will depend on their
subcellular localisation, as well as the α-helicity and surface accessibility of the motifs.
While it is clear that the conserved leucine residues are essential for the function of the
motif, other amino acids may also be important given the degree of sequence conservation of
equivalent motifs in SRC1/TIF2 or CBP/p300.

As many NR binding proteins contain multiple copies of the LXXLL motif it remains
to be established whether this facilitates the simultaneous contact of individual partners in
homo- and heterodimers of NRs, or whether it serves to provide alternative interaction
surfaces to accommodate conformational changes imposed by the binding of NRs to different
response elements. The systematic mutation of LXXLL motifs in coactivators such as SRC1
and CBP may allow us to decouple crosstalk or synergy between different signal transduction
pathways, and thus provide a better understanding of their proposed roles as coactivators and
integrators.

Example 5

Two Hybrid Interaction Assays 

The yeast reporter strain used for all two hybrid assays was W303-1B (HMLα MATα
HMRa his3-11,15 trp1-1 ade2-1 can1-100 leu2-3,112 ura3) carrying the plasmid
pRLΔ21-U3ERE which contains a lacZ reporter gene driven by three estrogen response elements
(EREs) 29. The plasmids pBL1 and pASV3 which express the human ER DNA binding
domain (DBD) and the VP16 acidic activation domain (AAD) respectively 30, were used to
generate DBD or AAD fusion proteins for two hybrid interaction analyses. DBD-LXXLL
motif fusion proteins were generated by ligation of phosphorylated, annealed oligonucleotide
pairs into the pBL1 vector. AAD-ER was constructed by cloning a PCR fragment encoding
amino acids 282-595 of the human ER into pASV3. AAD-ER Mut was constructed in a
similar fashion except that the amino acids M543 and L544 of ER were mutated to alanines
by recombinant PCR. All fusion constructs were fully sequenced. Transformants containing
the desired plasmids were obtained by selection for the appropriate plasmid markers and were
grown to late log phase in 15 ml of selective medium (yeast nitrogen base containing 1%
glucose and appropriate supplements) in the presence or absence of 10^-7 M 17-β-estradiol
(E2).

The expression of DBD- and AAD-fusion proteins in yeast cell-free extracts was
verified by immunodetection using a monoclonal antibody recognising the human ER (a gift
from P. Chambon, Strasbourg). The antibody recognises the "F" region of the LBD in the
human ER, and also the "F" region tag at the N-termini of the DBD fusion proteins 30. Equal
amounts of protein were electrophoresed on polyacrylamide gels and transferred to
nitrocellulose for western blotting. The preparation of cell-free extracts by the glass bead
method and the measurement of β-galactosidase activity in the extracts were performed as
previously described 29. Two hybrid experiments were repeated several times, and the data
shown in Figs. 1a and 1b represent reporter activities as measured in a single representative
experiment. The β-galactosidase activities are expressed as nmoles/minute/ng protein.

Example 6

In Vitro Binding And Peptide Inhibition Assays 

GST-AF2 consists of the ligand binding domain of the mouse ER (amino acids 313-599)
fused to glutathione-S-transferase and has been described previously 31. GST-CBP
consists of GST fused to the SRC1-binding domain of CBP and was constructed by cloning a
PCR fragment encoding residues 2058-2163 of mouse CBP into the vector pGEX2TK
(Pharmacia). Human SRC1a and SRC1e cDNAs were isolated from a human B cell cDNA
library and cloned into a modified version of the expression vector pSG5. SRC1a M1234 and
SRC1e M123 were constructed by recombinant PCR to introduce the mutations [L636A,
L637A, L693A, L694A, L752A, L753A, L1438A, L1439A] or [L636A, L637A, L693A,
L694A, L752A, L753A] respectively. All SRC1 constructs were fully sequenced.
GST-SEPHAROSE™ beads were loaded with GST alone or GST-fusion proteins prepared from
bacterial cell-free extracts. [35S]-labelled SRC1 proteins were generated by in vitro
translation and tested for interaction with GST proteins in the presence or absence of 10-^M
estradiol (E2) as previously described 21. Binding was carried out for 3 hours at 4°C with
gentle mixing in NETN buffer (100 mM NaCl, 1 mM EDTA, 0.5% NP-40, 20 mM Tris HCl,
pH 8.0) containing protease inhibitors in a final volume of 1 ml. Peptides P-1 and P-2 were
dissolved in water at a concentration of 4 mg/ml and added individually to GST-binding
reactions immediately before the addition of ligand. The increasing amounts of peptide added
in the competition experiments shown corresponded to 2.5, 5, 12.5 and 25 µM.
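The relationship between the 4 mg/ml peptide stock and the final micromolar concentrations can be worked through as follows. This is a sketch under stated assumptions: the molecular weights of P-1 and P-2 are not given here, so the value of 1000 g/mol is purely hypothetical, and the small change in reaction volume caused by adding the stock is ignored.

```python
STOCK_MG_PER_ML = 4.0               # peptide stock concentration (from the text)
REACTION_VOLUME_ML = 1.0            # final binding reaction volume (from the text)
FINAL_UM = [2.5, 5.0, 12.5, 25.0]   # final peptide concentrations (from the text)

# Assumption: hypothetical molecular weight, not a value given for P-1 or P-2.
ASSUMED_MW_G_PER_MOL = 1000.0


def stock_volume_ul(final_um, mw_g_per_mol,
                    stock_mg_per_ml=STOCK_MG_PER_ML,
                    reaction_ml=REACTION_VOLUME_ML):
    """Microlitres of stock needed to reach final_um in the reaction."""
    micromoles = final_um * reaction_ml / 1000.0   # (umol/l) x l
    micrograms = micromoles * mw_g_per_mol         # umol x g/mol = ug
    return micrograms / stock_mg_per_ml            # mg/ml equals ug/ul


for concentration in FINAL_UM:
    volume = stock_volume_ul(concentration, ASSUMED_MW_G_PER_MOL)
    print(f"{concentration:5.1f} uM -> {volume:.3f} ul of stock")
```

With the assumed molecular weight, 25 µM in 1 ml corresponds to 25 µg of peptide, or 6.25 µl of the 4 mg/ml stock; the real volumes scale in proportion to the actual molecular weight of the peptide.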

Using analogous methodology, peptides KLVQLLTTT (SEQ ID NO: 3), ILHRLLQE 
(SEQ ID NO: 4) and LLQQLLTE (SEQ ID NO: 5) can be shown to be inhibitors. 

Example 7 

Transient Reporter Assays

HeLa cells were transfected with 1 µg of reporter 2ERE-pS2-CAT, 150 ng of
β-galactosidase expression plasmid (internal control), 10 ng of ER expression plasmid and
50 or 200 ng of SRC1 expression plasmids or empty vector per well (in duplicate) using
24-well plates. Transfected cells were incubated overnight in Dulbecco's modified Eagle's
medium without phenol red and containing 10% charcoal-treated FBS, and washed in fresh
medium before addition of ligand (10^-8 M E2) or vehicle. After 40 hrs, cells were harvested
and extracts analysed for CAT and β-galactosidase activities 14,21. β-galactosidase
activities were used to correct for differences in transfection efficiency.
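The correction for transfection efficiency can be sketched as a simple ratio. The exact calculation used is not stated, so the following assumes that each well's CAT activity is divided by its β-galactosidase activity, that duplicate wells are averaged, and that results are then compared with the empty vector control; all readings shown are invented for illustration.

```python
def normalised_activity(cat, beta_gal):
    """CAT activity corrected by the beta-galactosidase internal control."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return cat / beta_gal


def mean_normalised(wells):
    """Average the corrected activity over replicate (cat, beta_gal) wells."""
    values = [normalised_activity(cat, beta_gal) for cat, beta_gal in wells]
    return sum(values) / len(values)


def fold_over_vector(sample_wells, vector_wells):
    """Corrected activity of a sample relative to the empty vector control."""
    return mean_normalised(sample_wells) / mean_normalised(vector_wells)


# Hypothetical duplicate readings in arbitrary units, not data from Fig. 2c.
empty_vector = [(50.0, 1.0), (100.0, 2.0)]
with_src1 = [(200.0, 2.0), (150.0, 1.5)]
print(fold_over_vector(with_src1, empty_vector))
```

Dividing by the internal control means that a well which took up twice as much DNA, and so shows twice the β-galactosidase activity, is not mistaken for a well with twice the reporter response.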

Example 8
Pharmaceutical Composition

The following illustrates a representative pharmaceutical dosage form containing a
peptide inhibitor and which may be used for therapy.

Injectable solution

A sterile aqueous solution, for injection, containing per ml of solution:

Peptide P-1                  5.0 mg
Sodium acetate trihydrate    6.8 mg
Sodium chloride              7.2 mg
Tween 20                     0.05 mg

A typical dose of peptide for adult humans is 30 mg.

Example 9

The strength of the interaction of coactivator signature motifs to NRs varies depending
on the precise motif sequence

The interaction between the ERα LBD and a range of different LXXLL motif fusion
proteins appeared to yield different reporter gene activity, indicating that the ERα LBD
interacted with the LXXLL motifs with differing affinities (see Example 5); therefore the
strength of these interactions was investigated more closely. The motifs of SRC-1a were
tested for their relative interaction specificities with the glucocorticoid receptor and
estrogen receptor isoforms α and β in a yeast two hybrid assay.

The SRC-1a motifs 1-4 (SEQ ID 73-76) were expressed as fusion proteins with the
LexA DNA binding domain. Motif fusion proteins were generated by ligation of annealed
oligonucleotide pairs in frame into the ADH promoter driven LexA DBD vector
YCp14-ADH-LexA. YCp14ADH1-LexA is a plasmid from which LexA is expressed under control of
the S. cerevisiae ADH1 promoter. The backbone of this vector is the plasmid RS314 (Sikorski
and Hieter, 1989). Between the Sac I and Kpn I restriction enzyme sites in the polylinker of
this vector we have placed an expression cassette comprising the promoter of the S. cerevisiae
ADH1 gene, the coding region of the E. coli LexA gene and the transcription termination
region of the S. cerevisiae ADH1 gene. The promoter region of ADH1 consists of a
1.4kb Bam HI-Hind III fragment from the plasmid pADNS (Colicelli et al, 1989). Following
the Hind III site is the coding region (amino acids 1-202) of E. coli LexA (Horii et al, 1981;
Miki et al, 1981; Markham et al, 1981). The E. coli LexA fragment was obtained as a Hind
III-Pst I fragment from the plasmid pBXL1 (Martin et al, 1990). This sequence corresponds
to nucleotides 95-710 of Genbank entry g146607. Following this sequence is a region which
encodes a polylinker (SEQ ID 77).

GAATTCCTGCAGCCCGGGGTCGACACTAGTTAACTAGCGGCCGC 

This polylinker adds the amino acids EFLQPGVDTS (SEQ ID NO: 80) to the carboxy
terminus of LexA. The Not I site at the end of this linker is linked to a DNA fragment which
includes the transcription terminator region of the S. cerevisiae ADH1 gene. This fragment is
the 0.6kb Not I-Bam HI from the plasmid pADNS (Colicelli et al, 1989).
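The amino acids contributed by the polylinker can be checked by translating SEQ ID 77 with the standard genetic code. The sketch below assumes that the first base of the polylinker is in frame with the LexA coding region, as the text implies; the function and variable names are hypothetical.

```python
# Standard genetic code, built in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for start in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[start:start + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


POLYLINKER_SEQ_ID_77 = "GAATTCCTGCAGCCCGGGGTCGACACTAGTTAACTAGCGGCCGC"
print(translate(POLYLINKER_SEQ_ID_77))  # EFLQPGVDTS
```

Translation ends at the TAA stop codon after ten residues, which agrees with the EFLQPGVDTS extension stated for SEQ ID NO: 80.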

The NR LBDs were expressed as fusions with the Gal4 transcriptional AD. The LBD
fusions were constructed by cloning a PCR fragment encoding the amino acids corresponding
to the LBDs into YCp15Gal1-r11. YCp15Gal1-r11 is a plasmid from which fusions of the
activation region of the S. cerevisiae Gal4 protein may be expressed under the control of the
S. cerevisiae GAL1 promoter. The backbone of this vector is the plasmid RS315 (Sikorski and
Hieter, 1989). Between the Sac I and Kpn I restriction enzyme sites in the polylinker of this
vector we have placed an expression cassette comprising the promoter of the S. cerevisiae
GAL1 gene, the coding region of the fusion protein and the transcriptional termination region
of the S. cerevisiae ADH1 gene. The GAL1 promoter (Johnson and Davis, 1984; Yocum et al,
1984; West et al, 1984) was obtained by amplification of a S. cerevisiae genomic fragment by
the polymerase chain reaction (PCR). This fragment corresponds to nucleotides 177 to 809 of
Genbank database entry g171546. This is followed by the sequence
AAGCTTCCACCATGGTGCCAAAGAAGAAACGTAAAGTT (SEQ ID 78).

This sequence provides a translation initiation codon and a sequence which encodes
the amino acids MVPKKKRKV (SEQ ID NO: 81). The last seven residues of this peptide
correspond to a region identified as a nuclear localisation signal in the SV40 T antigen (amino
acids 126-132 of Genbank entry g310678; Fiers et al, 1978; Reddy et al, 1978). This region is
linked to sequence encoding the region II transcription activation domain (amino acids
768-881) of Gal4, as defined by Ma and Ptashne (1987). The sequence was isolated by PCR from
plasmid pBXGalll (P. Broad unpublished) which is a mammalian version of the yeast
expression vector pMA236 (Ma and Ptashne, 1987) and corresponds to nucleotides 2744 to
3085 of Genbank entry g171557 (Laughon and Gesteland, 1984). This sequence is followed
by the polylinker
TCTAGACTGCAGACTAGTAGATCTCCCGGGGCGGCCGC (SEQ ID 79).
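The cloning sites available in this polylinker can be listed by scanning it for restriction enzyme recognition sequences. In the sketch below the enzyme names are an assumption in the sense that the text does not name the sites in SEQ ID 79; they are assigned here from the standard recognition sequences of the enzymes shown, and the function name is hypothetical.

```python
# Standard recognition sequences; the assignment to SEQ ID 79 is illustrative.
RECOGNITION_SITES = {
    "XbaI": "TCTAGA",
    "PstI": "CTGCAG",
    "SpeI": "ACTAGT",
    "BglII": "AGATCT",
    "SmaI": "CCCGGG",
    "NotI": "GCGGCCGC",
}


def find_sites(dna, sites):
    """Return sorted (1-based position, enzyme) pairs for every site found."""
    hits = []
    for enzyme, site in sites.items():
        start = dna.find(site)
        while start != -1:
            hits.append((start + 1, enzyme))
            start = dna.find(site, start + 1)
    return sorted(hits)


POLYLINKER_SEQ_ID_79 = "TCTAGACTGCAGACTAGTAGATCTCCCGGGGCGGCCGC"
for position, enzyme in find_sites(POLYLINKER_SEQ_ID_79, RECOGNITION_SITES):
    print(position, enzyme)
```

On this reading the 38 nucleotide polylinker is made up of six adjoining sites, each occurring once, ending in a Not I site as in the LexA vector polylinker.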
All fusion constructs were fully sequenced. The vectors YCp14-ADH-LexA and
YCp15Gal1-r11 replicate as single copy plasmids in yeast (ARS-CEN) and have TRP1 and
LEU2 markers respectively.

The S. cerevisiae strain MEY132 (M. Egerton, Zeneca Pharmaceuticals, unpublished,
genotype MATa leu2-3,112 ura3-52 trp1 his4 rme1) was employed as a host strain. A
reporter gene consisting of the E. coli lacZ (β-galactosidase) gene under the control of a
promoter containing two binding sites for the LexA protein was integrated at the ura3 locus of
this strain. The reporter gene plasmid, JP159-lexRE, was constructed using plasmid JP159 as a
backbone (J. Pearlberg, Ph.D. Thesis (1994) Harvard University). This is a shuttle vector
which contains the S. cerevisiae URA3 as a marker. The plasmid contains the E. coli
β-galactosidase gene under the control of a minimal promoter containing the TATA box and
transcription initiation site from the S. cerevisiae GAL1 promoter (Dixon et al, 1997).
Upstream of this promoter is a terminator from the GAL11 gene. Between the Xba I and Sal I
sites in the promoter of this plasmid was inserted a 35 nucleotide sequence corresponding to
naturally occurring binding sites for the LexA protein in the promoter of the colicin E1 gene
(Ebina et al, 1983; the sequence corresponds to residues 20-54 of Genbank entry g144345). This
sequence contains two LexA operators and is therefore referred to as "2lex". This reporter
plasmid was linearised within the URA3 gene and integrated into the ura3 locus of MEY132
to give the yeast strain MEY132-lexRE.

Transformants containing the 2 hybrid fusion constructs were grown to late log phase
in 20 ml selective medium (2% glucose and appropriate supplements). They were then diluted
into 2% galactose containing medium, in the presence or absence of ligand. As controls each
LBD fusion was coexpressed with the LexA DBD lacking the coactivator motifs and
similarly each motif LexA DBD fusion was coexpressed with Gal4 AD lacking the NR LBD.
The relative expression levels of DBD fusion proteins were determined by immunodetection
using a monoclonal antibody which recognises LexA.

The relative interaction specificities of the estrogen receptor isoforms α and β for the 4
motifs of SRC-1a are ranked as follows: 2>4>1>3 (see Figure 5). Glucocorticoid receptor
specificities are as follows: 4>1=2=3.

The yeast 2-hybrid system used in this Example utilises a single copy integrated
reporter gene construct. This enables a quantitative comparison between the different yeast
strains bearing the different SRC1a motifs. The data presented in Figure 5 suggest that motif
3 (SRC1a, 748-753) interacts very weakly, using this system, with ER. Longer exposures to
oestradiol (not shown) do however reveal a significant interaction. It is noted that Figure 1A
herein clearly shows an interaction between motif 3 and ER but this interaction is significantly
weaker than that seen with the other motifs. Any differences in this regard between Figures 5
and 1A can be explained by experimental design features. For example, vectors used to
generate the Figure 5 data were low copy number (centromere containing) vectors, whereas
vectors used to generate the Figure 1A data were multicopy (2 micron) vectors.